AITopics | empirical score

empirical score

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training

Neural Information Processing Systems, Jun-22-2026, 19:56:15 GMT

Diffusion models have achieved remarkable success across a wide range of generative tasks. A key challenge is understanding the mechanisms that prevent their memorization of training data and allow generalization. In this work, we investigate the role of the training dynamics in the transition from generalization to memorization. Through extensive experiments and theoretical analysis, we identify two distinct timescales: an early time $τ_\mathrm{gen}$ at which models begin to generate high-quality samples, and a later time $τ_\mathrm{mem}$ beyond which memorization emerges. Crucially, we find that $τ_\mathrm{mem}$ increases linearly with the training set size $n$, while $τ_\mathrm{gen}$ remains constant. This creates a window of training times, growing with $n$, in which models generalize effectively, despite showing strong memorization if training continues beyond it. It is only when $n$ becomes larger than a model-dependent threshold that overfitting disappears at infinite training times. These findings reveal a form of implicit dynamical regularization in the training dynamics, which allows models to avoid memorization even in highly overparameterized settings. Our results are supported by numerical experiments with standard U-Net architectures on realistic and synthetic datasets, and by a theoretical analysis using a tractable random features model studied in the high-dimensional limit.

artificial intelligence, machine learning, memorization, (19 more...)

Neural Information Processing Systems

Country: Europe > France (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Selective Underfitting in Diffusion Models

Song, Kiwhan, Kim, Jaeyeon, Chen, Sitan, Du, Yilun, Kakade, Sham, Sitzmann, Vincent

arXiv.org Artificial Intelligence, Oct-3-2025

Diffusion models have emerged as the principal paradigm for generative modeling across various domains. During training, they learn the score function, which in turn is used to generate samples at inference. They raise a basic yet unsolved question: which score do they actually learn? In principle, a diffusion model that matches the empirical score in the entire data space would simply reproduce the training data, failing to generate novel samples. Recent work addresses this question by arguing that diffusion models underfit the empirical score due to training-time inductive biases. In this work, we refine this perspective, introducing the notion of selective underfitting: instead of underfitting the score everywhere, better diffusion models more accurately approximate the score in certain regions of input space, while underfitting it in others. We characterize these regions and design empirical interventions to validate our perspective. Our results establish that selective underfitting is essential for understanding diffusion models, yielding new, testable insights into their generalization and generative performance.
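The claim that matching the empirical score everywhere reproduces the training data can be illustrated with a minimal sketch. It assumes a variance-exploding noise model, where a noisy sample is a training point plus Gaussian noise of standard deviation `sigma`; the function names `empirical_score` and `posterior_mean` are illustrative and do not come from the paper. Under that assumption the empirical score is a softmax-weighted average of directions toward the training points, and at small noise the implied denoised estimate collapses onto the nearest training point.

```python
import numpy as np

def empirical_score(x, data, sigma):
    """Score of p_sigma(x) = (1/n) * sum_i N(x; x_i, sigma^2 I).

    x: (d,) query point, data: (n, d) training points, sigma: noise level > 0.
    Assumes a variance-exploding noise model (illustrative choice).
    """
    diffs = data - x                                  # (n, d), rows are x_i - x
    logits = -np.sum(diffs**2, axis=1) / (2.0 * sigma**2)
    logits -= logits.max()                            # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum()                          # posterior weight of each x_i
    return weights @ diffs / sigma**2                 # (d,)

def posterior_mean(x, data, sigma):
    """Denoised estimate E[x_0 | x] via Tweedie's formula for this noise model."""
    return x + sigma**2 * empirical_score(x, data, sigma)

# Three hypothetical training points and a query close to the first one.
data = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
x = np.array([0.3, -0.2])
print(posterior_mean(x, data, sigma=0.1))   # small noise: dominated by the nearest point
print(posterior_mean(x, data, sigma=5.0))   # large noise: pulled toward the data mean
```

At small `sigma` the softmax weights concentrate on a single training point, which is the memorizing behaviour the abstract describes; a learned score that deviates from this function in some regions can produce samples that are not training points.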

artificial intelligence, diffusion model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.01378

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Gabriel, Franck, Ged, François, Veiga, Maria Han, Schertzer, Emmanuel

arXiv.org Machine Learning, May-30-2025

Diffusion models now set the benchmark in high-fidelity generative sampling, yet they can, in principle, be prone to memorization. In this case, their learned score overfits the finite dataset so that the reverse-time SDE samples are mostly training points. In this paper, we interpret the empirical score as a noisy version of the true score and show that its covariance matrix is asymptotically a re-weighted data PCA. In high dimension, the small-time limit makes the noise variance blow up while simultaneously reducing spatial correlation. To reduce this variance, we introduce a kernel-smoothed empirical score and analyze its bias-variance trade-off. We derive asymptotic bounds on the Kullback-Leibler divergence between the true distribution and the one generated by the modified reverse SDE. Regularization on the score has the same effect as increasing the size of the training dataset, and thus helps prevent memorization. A spectral decomposition of the forward diffusion suggests better variance control under some regularity conditions of the true data distribution. Reverse diffusion with kernel-smoothed empirical score can be reformulated as a gradient descent drifted toward a Log-Exponential Double-Kernel Density Estimator (LED-KDE). This perspective highlights two regularization mechanisms taking place in denoising diffusions: an initial Gaussian kernel first diffuses mass isotropically in the ambient space, while a second kernel applied in score space concentrates and spreads that mass along the data manifold. Hence, even a straightforward regularization, without any learning, already mitigates memorization and enhances generalization. Numerically, we illustrate our results with several experiments on synthetic and MNIST datasets.
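The abstract does not state the exact form of the kernel-smoothed score, so the following is only a sketch under a loudly stated assumption: smoothing is taken to mean averaging the empirical score over isotropic Gaussian perturbations of the query point, estimated by Monte Carlo. The paper's estimator may differ. The names `empirical_score`, `smoothed_score`, `bandwidth` and `n_samples` are hypothetical, and the noise model is assumed to be variance-exploding.

```python
import numpy as np

def empirical_score(x, data, sigma):
    """Score of (1/n) * sum_i N(x; x_i, sigma^2 I); x: (d,), data: (n, d)."""
    diffs = data - x
    logits = -np.sum(diffs**2, axis=1) / (2.0 * sigma**2)
    logits -= logits.max()                    # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ diffs / sigma**2

def smoothed_score(x, data, sigma, bandwidth, n_samples=256, seed=0):
    """ASSUMED smoothing: Monte Carlo average of the empirical score over
    Gaussian perturbations of x with standard deviation `bandwidth`.
    Antithetic pairs (+eps, -eps) are used to reduce Monte Carlo variance."""
    rng = np.random.default_rng(seed)
    eps = bandwidth * rng.standard_normal((n_samples, x.shape[0]))
    eps = np.concatenate([eps, -eps])         # antithetic pairs
    scores = np.array([empirical_score(x + e, data, sigma) for e in eps])
    return scores.mean(axis=0)

# Hypothetical data: compare raw and smoothed scores near a training point.
data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.4, 0.1])
print(empirical_score(x, data, sigma=0.1))
print(smoothed_score(x, data, sigma=0.1, bandwidth=0.3))
```

The bandwidth plays the role of the bias-variance knob: a bandwidth near zero recovers the raw empirical score, while a larger one averages over neighbouring training points and moves the estimate away from the raw empirical score.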

artificial intelligence, empirical score, machine learning, (18 more...)

arXiv.org Machine Learning

2505.22841

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training

Bonnaire, Tony, Urfin, Raphaël, Biroli, Giulio, Mézard, Marc

arXiv.org Machine Learning, May-26-2025

Diffusion models have achieved remarkable success across a wide range of generative tasks. A key challenge is understanding the mechanisms that prevent their memorization of training data and allow generalization. In this work, we investigate the role of the training dynamics in the transition from generalization to memorization. Through extensive experiments and theoretical analysis, we identify two distinct timescales: an early time $τ_\mathrm{gen}$ at which models begin to generate high-quality samples, and a later time $τ_\mathrm{mem}$ beyond which memorization emerges. Crucially, we find that $τ_\mathrm{mem}$ increases linearly with the training set size $n$, while $τ_\mathrm{gen}$ remains constant. This creates a window of training times, growing with $n$, in which models generalize effectively, despite showing strong memorization if training continues beyond it. It is only when $n$ becomes larger than a model-dependent threshold that overfitting disappears at infinite training times. These findings reveal a form of implicit dynamical regularization in the training dynamics, which allows models to avoid memorization even in highly overparameterized settings. Our results are supported by numerical experiments with standard U-Net architectures on realistic and synthetic datasets, and by a theoretical analysis using a tractable random features model studied in the high-dimensional limit.

artificial intelligence, machine learning, memorization, (18 more...)

arXiv.org Machine Learning

2505.17638

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Losing dimensions: Geometric memorization in generative diffusion

Achilli, Beatrice, Ventura, Enrico, Silvestri, Gianluigi, Pham, Bao, Raya, Gabriel, Krotov, Dmitry, Lucibello, Carlo, Ambrogioni, Luca

arXiv.org Machine Learning, Oct-11-2024

Generative diffusion processes are state-of-the-art machine learning models deeply connected with fundamental concepts in statistical physics. Depending on the dataset size and the capacity of the network, their behavior is known to transition from an associative memory regime to a generalization phase in a phenomenon that has been described as a glassy phase transition. Here, using statistical physics techniques, we extend the theory of memorization in generative diffusion to manifold-supported data. Our theoretical and experimental findings indicate that different tangent subspaces are lost due to memorization effects at different critical times and dataset sizes, which depend on the local variance of the data along their directions. Perhaps counterintuitively, we find that, under some conditions, subspaces of higher variance are lost first due to memorization effects. This leads to a selective loss of dimensionality where some prominent features of the data are memorized without a full collapse on any individual training point. We validate our theory with a comprehensive set of experiments on networks trained both on image datasets and on linear manifolds, which show remarkable qualitative agreement with the theoretical predictions.

diffusion model, singular value, normalized singular value, (14 more...)

arXiv.org Machine Learning

2410.08727

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Extrapolating false alarm rates in automatic speaker verification

Sholokhov, Alexey, Kinnunen, Tomi, Vestman, Ville, Lee, Kong Aik

arXiv.org Machine Learning, Aug-8-2020

Automatic speaker verification (ASV) vendors and corpus providers would both benefit from tools to reliably extrapolate performance metrics for large speaker populations without collecting new speakers. We address false alarm rate extrapolation under a worst-case model whereby an adversary identifies the closest impostor for a given target speaker from a large population.

In this study we improve upon the generative model presented in [3]. Despite demonstrating expected overall trends, the predicted false alarm rates were substantially overestimated, particularly at high ASV thresholds (proxies of high-security applications). To tackle this shortcoming, we propose a discriminative …

artificial intelligence, impostor, machine learning, (20 more...)

arXiv.org Machine Learning

2008.0359

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.04)
Europe > Finland > North Karelia > Joensuu (0.04)
(4 more...)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.86)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.72)
